41 |
CNGL: Grading student answers by acts of translation
|
|
|
|
In: Bicici, Ergun orcid:0000-0002-2293-2031 and van Genabith, Josef orcid:0000-0003-1322-7944 (2013) CNGL: Grading student answers by acts of translation. In: SEMEVAL, 14-15 Jun 2013, Atlanta, Georgia. (2013)
|
|
BASE
|
|
42 |
Definition of interfaces
|
|
|
|
In: Almaghout, Hala, Bicici, Ergun, Doherty, Stephen orcid:0000-0003-0887-1049 , Gaspari, Federico, Groves, Declan, Toral, Antonio orcid:0000-0003-2357-2960 , van Genabith, Josef orcid:0000-0003-1322-7944 , Popović, Maja orcid:0000-0001-8234-8745 and Piperidis, Stelios (2013) Definition of interfaces. Project Report. QTLaunchPad. (2013)
|
|
BASE
|
|
43 |
Mapping the industry I: Findings on translation technologies and quality assessment
|
|
|
|
In: Doherty, Stephen orcid:0000-0003-0887-1049 , Gaspari, Federico, Groves, Declan and van Genabith, Josef orcid:0000-0003-1322-7944 (2013) Mapping the industry I: Findings on translation technologies and quality assessment. Technical Report. GALA. (2013)
|
|
BASE
|
|
44 |
Quality metrics for human and machine translation.
|
|
|
|
In: Doherty, Stephen orcid:0000-0003-4864-5986 , Gaspari, Federico, Groves, Declan, Srivastava, Ankit Kumar and van Genabith, Josef orcid:0000-0003-1322-7944 (2013) Quality metrics for human and machine translation. Project Report. UNSPECIFIED. (2013)
|
|
BASE
|
|
45 |
CNGL-CORE: Referential translation machines for measuring semantic similarity
|
|
|
|
In: Bicici, Ergun orcid:0000-0002-2293-2031 and van Genabith, Josef orcid:0000-0003-1322-7944 (2013) CNGL-CORE: Referential translation machines for measuring semantic similarity. In: *SEM, 13-14 Jun 2013, Atlanta, Georgia. (2013)
|
|
BASE
|
|
46 |
Working with a small dataset - semi-supervised dependency parsing for Irish
|
|
|
|
BASE
|
|
47 |
A corpus-based finite-state morphological toolkit for contemporary Arabic
|
|
|
|
BASE
|
|
48 |
Detecting grammatical errors with treebank-induced, probabilistic parsers
|
|
|
|
In: Wagner, Joachim orcid:0000-0002-8290-3849 (2012) Detecting grammatical errors with treebank-induced, probabilistic parsers. PhD thesis, Dublin City University. (2012)
|
|
BASE
|
|
49 |
Identifying high-impact sub-structures for convolution kernels in document-level sentiment classification
|
|
|
|
In: Tu, Zhaopeng, He, Yifan, Foster, Jennifer orcid:0000-0002-7789-4853 , van Genabith, Josef orcid:0000-0003-1322-7944 , Liu, Qun and Lin, Shouxun (2012) Identifying high-impact sub-structures for convolution kernels in document-level sentiment classification. In: Annual Meeting of the Association for Computational Linguistics (ACL 2012), 9-11 Jul 2012, Jeju, Korea. (2012)
|
|
BASE
|
|
50 |
Irish treebanking and parsing: a preliminary evaluation
|
|
|
|
In: Lynn, Teresa, Cetinoglu, Ozlem, Foster, Jennifer orcid:0000-0002-7789-4853 , Uí Dhonnchadha, Elaine orcid:0000-0003-3448-4288 , Dras, Mark orcid:0000-0001-9908-7182 and van Genabith, Josef orcid:0000-0003-1322-7944 (2012) Irish treebanking and parsing: a preliminary evaluation. In: International Conference on Linguistic Resources and Evaluation, 21-27 May 2012, Istanbul, Turkey. (2012)
|
|
BASE
|
|
52 |
Decreasing lexical data sparsity in statistical syntactic parsing - experiments with named entities
|
|
|
|
In: Hogan, Deirdre, Foster, Jennifer orcid:0000-0002-7789-4853 and van Genabith, Josef orcid:0000-0003-1322-7944 (2011) Decreasing lexical data sparsity in statistical syntactic parsing - experiments with named entities. In: Multiword Expressions: from Parsing and Generation to the Real World (MWE). Workshop at ACL 2011, 19-24 June 2011, Portland, Oregon. (2011)
|
|
BASE
|
|
53 |
Deep Syntax in Statistical Machine Translation
|
|
Graham, Yvette. - : Dublin City University. National Centre for Language Technology (NCLT), 2011. : Dublin City University. School of Computing, 2011
|
|
In: Graham, Yvette (2011) Deep Syntax in Statistical Machine Translation. PhD thesis, Dublin City University. (2011)
|
|
Abstract:
Statistical Machine Translation (SMT) via deep syntactic transfer employs a three-stage architecture: (i) parse the source language (SL) input, (ii) transfer the SL deep syntactic structure to the target language (TL), and (iii) generate a TL translation. The deep syntactic transfer architecture achieves a high level of language pair independence compared to other Machine Translation (MT) approaches, as translation is carried out at the more language-independent level of deep syntactic representation. TL word order can be generated independently of SL word order, so no reordering model between source and target words is required. In addition, words in dependency relations are adjacent in the deep syntactic structure, although they are often distant in surface form strings. This adjacency allows the extraction of more general transfer rules than the rules or phrases extracted from the surface form corpus. It also allows the use of a TL deep syntax language model, which models a deeper notion of fluency than a string-based language model and may lead to better lexical choice. The deep syntactic representation also contains words in lemma form with morpho-syntactic information, which brings new inflections of lemmas that are not observed in bilingual training data, and are therefore out of coverage for other SMT approaches, within coverage of deep syntactic transfer. In this thesis, we adapt existing methods already successful in Phrase-Based SMT (PB-SMT) to deep syntactic transfer and present new methods of our own. We present a new definition for consistent deep syntax transfer rules, inspired by the definition of a consistent phrase in PB-SMT, and we extract all rules consistent with the node alignment, as smaller rules provide high coverage of unseen data, while larger rules provide more fluent combinations of TL words. 
Since large numbers of consistent transfer rules exist per sentence pair, we also provide an efficient method of extracting rules as well as an efficient method of storing them. We also present a deep syntax translation model: as in other SMT approaches, we use a log-linear combination of feature functions, including a translation model computed from relative frequencies of transfer rules, lexical weighting, a deep syntax language model and a string-based language model. In addition, we describe methods of carrying out transfer decoding (the search for TL deep syntactic structures), how we efficiently integrate a deep syntax trigram language model into decoding, and methods of translating morpho-syntactic information separately from lemmas using an adaptation of Factored Models. We then present an experimental evaluation in which we compare MT output for different configurations of our SMT via deep syntactic transfer system. We investigate various methods of word alignment, methods of translating morpho-syntactic information, limits on transfer rule size, different beam sizes during transfer decoding, generating from different-sized lists of TL decoder output structures, and deterministic versus non-deterministic generation. We also evaluate the deep syntax language model in isolation from the MT system and compare it to a string-based language model. Finally, we compare the performance and types of translations our system produces with those of a state-of-the-art phrase-based statistical machine translation system. Although the deep syntax system in general currently under-performs, it does achieve state-of-the-art performance for translation of a specific syntactic construction, the compound noun, and for translations within coverage of the TL precision grammar used for generation. We provide the software for transfer rule extraction, as well as the transfer decoder, as open source tools to assist future research.
|
|
Keyword:
Lexical Functional Grammar; Machine translating
|
|
URL: http://doras.dcu.ie/16078/
|
|
BASE
|
|
54 |
Treebank-Based Deep Grammar Acquisition for French Probabilistic Parsing Resources
|
|
Schluter, Natalie. - : Dublin City University. National Centre for Language Technology (NCLT), 2011. : Dublin City University. School of Computing, 2011
|
|
In: Schluter, Natalie (2011) Treebank-Based Deep Grammar Acquisition for French Probabilistic Parsing Resources. PhD thesis, Dublin City University. (2011)
|
|
BASE
|
|
55 |
Comparing the use of edited and unedited text in parser self-training
|
|
|
|
In: Foster, Jennifer orcid:0000-0002-7789-4853 , Cetinoglu, Ozlem, Wagner, Joachim orcid:0000-0002-8290-3849 and van Genabith, Josef orcid:0000-0003-1322-7944 (2011) Comparing the use of edited and unedited text in parser self-training. In: The 12th International Conference on Parsing Technologies (IWPT 2011), 05-07 Oct 2011, Dublin, Ireland. ISBN 978-1-932432-04-6 (2011)
|
|
BASE
|
|
56 |
From news to comment: Resources and benchmarks for parsing the language of Web 2.0
|
|
|
|
In: Foster, Jennifer orcid:0000-0002-7789-4853 , Cetinoglu, Ozlem, Wagner, Joachim orcid:0000-0002-8290-3849 , Le Roux, Joseph, Nivre, Joakim, Hogan, Deirdre and van Genabith, Josef orcid:0000-0003-1322-7944 (2011) From news to comment: Resources and benchmarks for parsing the language of Web 2.0. In: The 5th International Joint Conference on Natural Language Processing (IJCNLP), 08-13 Nov 2011, Chiang Mai, Thailand. ISBN 978-974-466-564-5 (2011)
|
|
BASE
|
|
57 |
#hardtoparse: POS tagging and parsing the twitterverse
|
|
|
|
In: Foster, Jennifer orcid:0000-0002-7789-4853 , Cetinoglu, Ozlem, Wagner, Joachim orcid:0000-0002-8290-3849 , Le Roux, Joseph, Hogan, Stephen, Nivre, Joakim, Hogan, Deirdre and van Genabith, Josef orcid:0000-0003-1322-7944 (2011) #hardtoparse: POS tagging and parsing the twitterverse. In: The AAAI-11 Workshop on Analyzing Microtext, 8 Aug 2011, San Francisco, CA. (2011)
|
|
BASE
|
|
58 |
The integration of machine translation and translation memory
|
|
He, Yifan. - : Dublin City University. Centre for Next Generation Localisation (CNGL), 2011. : Dublin City University. School of Computing, 2011
|
|
In: He, Yifan (2011) The integration of machine translation and translation memory. PhD thesis, Dublin City University. (2011)
|
|
BASE
|
|
59 |
Improving dependency label accuracy using statistical post-editing: A cross-framework study
|
|
|
|
In: Cetinoglu, Ozlem, Bryl, Anton, Foster, Jennifer orcid:0000-0002-7789-4853 and van Genabith, Josef orcid:0000-0003-1322-7944 (2011) Improving dependency label accuracy using statistical post-editing: A cross-framework study. In: International Conference on Dependency Linguistics (DepLing), 5-7 Sept 2011, Barcelona, Spain. (2011)
|
|
BASE
|
|
60 |
f-align: An open-source alignment tool for LFG f-structures
|
|
|
|
In: Bryl, Anton and van Genabith, Josef orcid:0000-0003-1322-7944 (2010) f-align: An open-source alignment tool for LFG f-structures. In: AMTA, 31 Oct - 4 Nov 2010, Denver, Colorado. (2010)
|
|
BASE
|
|